Boosting in Cox regression: a comparison between the likelihood-based and the model-based approaches with focus on the R-packages CoxBoost and mboost
Abstract
Despite the limitations imposed by the proportional hazards assumption, the Cox model is probably the most popular statistical tool for analyzing survival data, thanks to its flexibility and ease of interpretation. For this reason, novel statistical and machine learning techniques are usually adapted to fit it. One such technique is boosting, an iterative method originally developed in the machine learning community and later extended to the statistical field. The popularity of boosting has been further driven by the availability of user-friendly software such as the R packages mboost and CoxBoost, both of which implement boosting in conjunction with the Cox model. Although they share the same underlying boosting principles, the two packages use different techniques: the former is an adaptation of model-based boosting, while the latter adapts likelihood-based boosting. Here we contrast these two boosting techniques, as implemented in the R packages, from an analytic point of view, and we examine the solutions adopted there to treat mandatory variables, i.e. variables that for some reason must be included in the model. We explore the possibility of extending solutions currently implemented in only one package to the other. We illustrate the usefulness of these extensions by applying them to two real data examples.
Keywords: boosting, CoxBoost, Cox model, mandatory variables, mboost.
Affiliation: Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Germany
Related resources
Evaluation of Risk Factors of Recurrence of Hodgkin's Lymphoma Using Random Survival Forest and Comparison with Cox Regression Model
Background: In many studies, Cox regression was used to assess the important factors that affect the survival of cancer patients based on demographic and clinical variables. The aim of this study was to determine the factors affecting the survival of patients with Hodgkin's lymphoma using the random survival forest (RSF) method and compare it with the Cox model. Methods: In this retrospective ...
The Effect of Time-dependent Prognostic Factors on Survival of Non-Small Cell Lung Cancer using Bayesian Extended Cox Model
Abstract Background: Lung cancer is one of the most common cancers around the world. The aim of this study was to use Extended Cox Model (ECM) with Bayesian approach to survey the behavior of potential time-varying prognostic factors of Non-small cell lung cancer. Materials and Methods: Survival status of all 190 patients diagnosed with Non-Small Cell lung cancer referring to hospitals in ...
Identification of Factors Affecting Metastatic Gastric Cancer Patients’ Survival Using the Random Survival Forest and Comparison with Cox Regression Model
Background and Objectives: In survival analysis, using the Cox model to determine the effective factors requires assumptions whose violation leads to biased results. The aim of this paper was to determine the factors affecting the survival of metastatic gastric cancer patients using the non-parametric Random Survival Forest (RSF) model and to compare its result with the Cox m...
Comparison of Maximum Likelihood Estimation and Bayesian with Generalized Gibbs Sampling for Ordinal Regression Analysis of Ovarian Hyperstimulation Syndrome
Background and Objectives: Analysis of ordinal outcome data can lead to biased estimates and large variance when the data are sparse. The objective of this study is to compare parameter estimates of an ordinal regression model under the maximum likelihood and Bayesian frameworks with generalized Gibbs sampling. The models were used to analyze ovarian hyperstimulation syndrome data. Methods: This study use...
On Model-Based Clustering, Classification, and Discriminant Analysis
The use of mixture models for clustering and classification has burgeoned into an important subfield of multivariate analysis. These approaches have been around for a half-century or so, with significant activity in the area over the past decade. The primary focus of this paper is to review work in model-based clustering, classification, and discriminant analysis, with particular attenti...
Publication date: 2015